Comparative Study of Biological Production in Eastern Boundary Upwelling Systems Using an Artificial Neural Network
Abstract
We used a feature selection method to reduce the number of potential drivers and retain only the most important predictors of net primary production (NPP). This was done with an algorithm based on the self-organizing map (SOM) that relies on the SOM's topology-preserving property. This supplementary material details our feature selection algorithm. Two subsets of drivers emerge as the most relevant for explaining NPP variability: they include the upwelling index and the nitrate concentration at 50 m depth, in addition to three common drivers: the eddy kinetic energy, the mixed layer depth and the continental shelf width.
1 Feature selection method
In this study we use a SOM to perform feature selection (see Laha (2005), Laine and Similä (2004) and Benabdeslem and Lebbah (2007) for other applications). The SOM can assist in feature selection because of its topology preservation property, which makes it possible to derive a quantitative measure of the relative importance of each predictor to the target data to be explained. However, the SOM's ability to preserve topology is severely degraded if the intrinsic dimension of the data (which is usually lower than the absolute dimension) is much higher than the dimension of the SOM lattice, here 2 (Ritter and Schulten, 1988). Because of this dimensionality mismatch, the SOM folds and twists to achieve the mapping, which leads to a greater violation of topology preservation. Including the least important predictors artificially increases the intrinsic dimension of the data. Conversely, a fully dependent variable improves the neighborhood relationship and so reduces the topological violation. Therefore, by measuring for each subset of variables both the corresponding topological violation and the improvement in topology preservation obtained by including the dependent variable, NPP, we were able to rank the different combinations of drivers according to their degree of relevance.
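The ranking step can be illustrated with a small sketch. The text does not say how the two criteria (low topological violation, high NPP-induced improvement) are combined into a single ranking, so this example assumes one possible reading: keep every subset that is not dominated on both criteria at once. The function name `pareto_best` and the layout of `scores` are hypothetical and are not taken from the original study.

```python
def pareto_best(scores):
    """Return the driver subsets that no other subset beats on both criteria.

    scores maps a subset (tuple of driver names) to a pair
    (violation_without_npp, violation_with_npp). This layout is an
    assumption made for illustration only.
    """
    # Improvement = how much the violation drops once NPP is included.
    points = {s: (v, v - v_npp) for s, (v, v_npp) in scores.items()}
    best = []
    for subset, (viol, gain) in points.items():
        dominated = any(
            v2 <= viol and g2 >= gain and (v2 < viol or g2 > gain)
            for other, (v2, g2) in points.items()
            if other != subset
        )
        if not dominated:
            best.append(subset)
    return best
```

A weighted score or a lexicographic sort would be equally plausible ways to combine the two measures; the non-dominated filter is used here only because it avoids choosing a weighting that the text does not state.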
More specifically, the set of drivers that yields the lowest topological violation and the highest NPP-induced improvement in topology preservation should provide the most compact representation of the data, using only the most relevant features. Our algorithm performs an exhaustive search for the subset of variables that gives the most compact mapping according to this topological violation criterion. The topological violation is measured as the proportion of all data vectors for which the first and second best-matching units (BMUs), i.e. the two closest neurons, are not adjacent units on the map. This is traditionally referred to as the topographic error (Kohonen, 2000). The algorithm browses only subsets with more than 3 drivers, because the topological violation criterion cannot discriminate between subsets whose number of variables approaches the map dimension (2). Figure 1 shows the results of this ranking. According to this criterion, two subsets of drivers appear to be the most relevant for explaining NPP variability: they include the upwelling index and the nitrate concentration at 50 m depth, in addition to three common drivers: the eddy kinetic energy, the mixed layer depth...
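The topographic error and the subset enumeration described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes an already trained SOM whose weights are stored in a NumPy array `codebook` of shape (rows, cols, n_features), a rectangular lattice, and an 8-neighbourhood definition of adjacency (a hexagonal lattice would need a different test). All function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def best_two_units(codebook, x):
    """Return grid coordinates of the first and second best-matching units."""
    _, cols, _ = codebook.shape
    # Euclidean distance from x to every unit, flattened in row-major order.
    dist = np.linalg.norm(codebook - x, axis=2).ravel()
    first, second = np.argsort(dist)[:2]
    return divmod(int(first), cols), divmod(int(second), cols)


def topographic_error(codebook, data):
    """Fraction of samples whose two best-matching units are not adjacent."""
    errors = 0
    for x in data:
        (r1, c1), (r2, c2) = best_two_units(codebook, x)
        # Assumption: rectangular lattice, 8-neighbourhood counts as adjacent.
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            errors += 1
    return errors / len(data)


def candidate_subsets(drivers, min_size=4):
    """Yield every subset of drivers with at least min_size members."""
    for k in range(min_size, len(drivers) + 1):
        yield from combinations(drivers, k)
```

In a full search, each subset yielded by `candidate_subsets` would be used to train one SOM on the drivers alone and one on the drivers plus NPP, and `topographic_error` would be evaluated on both maps. The default `min_size=4` mirrors the statement that only subsets with more than 3 drivers are browsed.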
Similar resources
Comparative Study of Static and Dynamic Artificial Neural Network Models in Forecasting of Tehran Stock Exchange
In recent decades, neural network models have attracted researchers' attention because of their more realistic performance, and different types of these models have therefore been used in forecasting. This raises the question of which kind of model has more explanatory power in forecasting the future behavior of stocks. Accordingly, the present paper made a comparison betw...
Identification of selected monogeneans using image processing, artificial neural network and K-nearest neighbor
Over the last two decades, improvements in computational tools have contributed significantly to the classification of images of biological specimens into their corresponding species. These days, identification of biological species is much easier for taxonomists and even non-taxonomists thanks to the development of automated computer techniques and systems. In this study, we d...
Predicting Force in Single Point Incremental Forming by Using Artificial Neural Network
In this study, an artificial neural network was used to predict the minimum force required for single point incremental forming (SPIF) of thin sheets of aluminium AA3003-O and calamine brass Cu67Zn33 alloy. Accordingly, the processing parameters, i.e., step depth, feed rate of the tool, spindle speed, wall angle, thickness of the metal sheets and type of material, were selected as input and t...
Prediction of Egg Production Using Artificial Neural Network
Artificial neural networks (ANN) have been shown to be a powerful tool for system modeling in a wide range of applications. The focus of this study is on neural network applications to data analysis in egg production. An ANN model with two hidden layers, trained with a back propagation algorithm, successfully learned the relationship between the input (age of hen) and output (egg production) variabl...
Improving biological activity prediction of protein kinase inhibitors using artificial neural network and partial least square methods
Introduction: Protein kinases cause many diseases, including cancer; therefore, inhibiting them plays an important role in the treatment of many diseases. Traditional discovery of inhibitors of these enzymes is a time-consuming and costly process. Finding reliable computer-aided drug discovery tools that can detect the inhibitors will reduce the cost. In this study, it is attempted to separate ki...